T-Test Results - Generated on 2025-05-16 00:43:29

Comparison: Zero-Shot (ZS) vs Chain of Thought (COT) (Unmasked Strategies - Filtered for models with both ZS & COT data)
  Comparison: ZS Accuracy (Filtered) vs COT Accuracy (Filtered)
    T-statistic: -4.0813
    P-value: 0.0028
    Result: Statistically significant (p < 0.05)

Comparison: Perturbation Types (ZS Prompt Strategy)
  Comparison: Unmasked vs Masked
    T-statistic: -0.3511
    P-value: 0.7255
    Result: Not statistically significant (p >= 0.05)

  Comparison: Unmasked vs Obfuscated
    T-statistic: 3.3395
    P-value: 0.0008
    Result: Statistically significant (p < 0.05)

  Comparison: Masked vs Obfuscated
    T-statistic: 3.6913
    P-value: 0.0002
    Result: Statistically significant (p < 0.05)

Comparison: Deterministic vs Stochastic Strategies (ZS Unmasked)
  Comparison: Deterministic Accuracy vs Stochastic Accuracy
    T-statistic: 8.0965
    P-value: < 0.0001
    Result: Statistically significant (p < 0.05)
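
Note on the Result lines: each one applies the same decision rule, comparing the p-value against a 0.05 threshold. The following is a minimal sketch of how one report block could be formatted from a t-statistic and p-value. The helper name `format_result` and its parameters are hypothetical and not taken from the script that generated this report. It assumes the 0.05 threshold shown above, and it prints a p-value that would round to 0.0000 at four decimals as an upper bound instead.

```python
def format_result(name, t_stat, p_value, alpha=0.05):
    """Return the report lines for one comparison (hypothetical helper)."""
    # A p-value below the print precision is shown as a bound, not as 0.0000.
    p_text = "< 0.0001" if p_value < 0.0001 else f"{p_value:.4f}"

    # Decision rule used throughout the report: significant when p < alpha.
    if p_value < alpha:
        verdict = f"Statistically significant (p < {alpha})"
    else:
        verdict = f"Not statistically significant (p >= {alpha})"

    return [
        f"  Comparison: {name}",
        f"    T-statistic: {t_stat:.4f}",
        f"    P-value: {p_text}",
        f"    Result: {verdict}",
    ]


if __name__ == "__main__":
    # Values copied from the Unmasked vs Masked block above.
    print("\n".join(format_result("Unmasked vs Masked", -0.3511, 0.7255)))
```

The sketch only covers formatting and the threshold check. How the t-statistic and p-value themselves were computed (for example, paired versus independent samples) is not stated in this report and is not assumed here.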

